Abstract


• As NASA’s use of spaceflight computing has
evolved, its memory applications have evolved
with it. In this talk, we will discuss the history of
NASA’s memories from magnetic core and tape 
recorders to current semiconductor approaches. 
We will briefly describe current functional 
memory usage in NASA space systems followed 
by a description of potential radiation-induced 
failure modes along with considerations for 
reliable system design. 
Outline of Presentation

• Introduction - The Space Memory Story 

- A look at how we got here 

• General Applications of Memories in Space 
Systems 

• Requirements and Desirements 

• Example: SDRAMs 

- Radiation Failure Modes - Single Events 

- Design Approaches 

• Reliability Considerations 

• Summary 
"Once upon a time..."


• There once was a fledgling memory used for space 

- It started out as core memory (60’s-70’s) 

- Grew into magnetic tape (70’s-80’s) 

- And has settled into “silicon” solid state recorders or 
SSRs (90’s and beyond) 

• While this is true for mass data storage, silicon has been 
used since the 70’s for some memory applications such as 
computer programs and data buffers 

• Both volatile and non-volatile memories (NVMs) are used 



- 4 kB of magnetic core r/w memory

- P87-2, circa 1990: 1st known spaceflight SSR
Sample Single Event Upset (SEU)
hiccups along the way 



An original space SEU detector, the 93L422 - bit errors 
in space 

- TDRS-1 anomalies, for example 

• “Solved” by use of error detection and correction (EDAC)
codes

• Used as “gold standard” on multiple flight experiments 
(CRRES and MPTB) 

Single event functional interrupts - SEFIs 

- Device has a functional anomaly 
Stuck bits

Multi-bit/multi-cell upsets 

Block errors 

Small probability events 

- Proton ground test of 3 samples 

- Flight SSR had >1000 devices

- Anomaly in-flight traced to low-probability event 
Categories of Memory Usage for Space



• Computer program storage 

- Boot, application, safehold 

- Often a mix of volatile and non-volatile memories 

• Store in NVM, download on boot to RAM, run out of RAM 
- Size, Weight, and Power (SWaP) - RAM is faster than NVM 

• Temporary data buffers 

- Accommodates burst operations 

• Data Storage such as SSR 

- E.g. mass storage area for science or spacecraft telemetry 

- Usually write once an orbit, read once an orbit 

- Trend to want to use NVM for SSR 

• Configuration storage for volatile Field 
Programmable Gate Arrays (FPGAs) 

- Becoming a bigger problem as FPGAs increase their 
needs 
The Volatile Memory for Space


• Rad Hard Offerings are limited to SRAM 

- 16 Mb maximum single die 

- These tend to be medium speed and relatively high 
power when compared to commercial equivalents 

• For comparison, first SSR in 1990 used 256 Mb commercial 
die 

- Still used extensively in rad hard computer offerings, 
but many designs have transitioned to DRAM options 

• Mid-1990’s = transition in SSRs from commercial 
SRAM to DRAM 

- SDRAMs are in flight, and many current designs have
begun to use double data rate (DDR) and DDR2
The SDRAM Quandary



• Many space designs are baselining/using DDR and DDR2 
interfaces for hardware builds 

- Problem: DDR3 expected to dominate commercial product
starting in 2010

• Do we support current system designs or product 
development timelines? 

- Will DDR2 be obsolete by system readiness dates? 


DRAM Shipment By Technology 
DDR Performance Metrics

                     DDR                      DDR2                    DDR3
Data Rate            200-400Mbps              400-800Mbps             800-1600Mbps
Interface            SSTL_2                   SSTL_18                 SSTL_15
Source Sync          Bidirectional DQS        Bidirectional DQS       Bidirectional DQS
                     (Single ended default)   (Single/Diff Option)    (Differential default)
Burst Length         BL= 2, 4, 8              BL= 4, 8                BL= 4, 8
                     (2bit prefetch)          (4bit prefetch)         (8bit prefetch)
CL/tRCD/tRP          15ns each                15ns each               12ns each
Reset                No                       No                      Yes
ODT                  No                       Yes                     Yes
Driver Calibration   No                       Off-Chip                On-Chip with ZQ pin
Leveling             No                       No                      Yes
(unlabeled)          1.5V                     1.25V                   1.0V
The Non-Volatile Memory for Space

• Rad Hard offerings are limited to small SONOS or CRAM devices 

- Used in many RH processor systems that do not have large program memory 
space requirement - 4-16 Mb per die maximum 

• Evolution of commercial NVM in space 

- PROMs

• Older commercial PROMs were reasonably good, but one-time programmable (OTP)

- EPROMs 

• Used in a few systems in the 90’s, but had TID issues 

- EEPROMs 

• In use from the 90’s to today, despite both TID and SEE (write mode) concerns 

- SEEQ 256 Mb (now obsolete) 

- Hitachi 1 Mb (now sold through re-packaging/screening houses) 

- Flash 

• The latest “in vogue” commercial NVM due to density (32Gb die coming soon) 

- Much improved TID tolerance compared with older EEPROMs

- SEFI and SEL still issues 

• Some space system primes are planning on using these in SSR applications 

“In summary, the Signetics PROMs are recommended (given previous total dose studies) 
for usage as are the SEEQ EEPROMs during read operations, it is not recommended, 
pending further investigation, to use the SEEQ EEPROMs for in-flight programming.”

- LaBel, Nov 1990 Test Report
Alternate Material NVMs


Alternate material NVMs - evaluated as devices become available 

- Expect cell integrity to perform fairly well under irradiation on most 
NVMs 


- LaBel’s Truism: 


• There are ALWAYS more challenges in “qualifying” a new technology
device than expected


Phase change memories (PCM) 

- Density, speed, and power look promising 

• Temperature is the challenge 

- Ex., Samsung, Numonyx - initial data taken 

[Figure: Numonyx PCM - Tech transfer opportunity?]

CNT

MRAM

- Spin Torque appears to improve SWaP metrics

- Ex., Avalanche Technologies

Resistive Memories 

- Ex., Unity Semiconductor, HP Labs 

• Unity’s talking about a 64Gb device by next summer! 

NVSRAMs 


- Ex. Cypress 
The Changing World of Radiation Testing of Memories

Comparing SEE Testing of Commercial Memories - 1996 to 2006 



Devices under test (DUTs):
Commercial Memory

- For use in solid state 
recorder (SSR) applications 

1996 

- SRAM memory 

• 1 µm feature size

• 4 Mbits per device 

• <50 MHz bus speed 

• Ceramic packaged DIP or 
LCC or QFP 

2006 

- DUT: DDR2 SDRAM 

• 90 nm feature size 

• 1 Gbit per device 

• >500 MHz bus speed 

• Plastic FcBGA or TSOP 

• Hidden registers and modes 

• Built-in microcontroller 


Sample Issues for SEE 
Testing 

- Size of memory 

• Drives complexity on tester side for 
amount of storage, real time processing, 
and length of test runs 

- Speed 

• Difficult to test at high-speeds reliably 

- Need low-noise and high-speed test 
fixture 

• Classic bit flips (memory cell) extended 
to include transient propagation (used to 
be too slow a device to respond) 

• Thermal and mechanical issues (testing 
in air/vacuum) 

- Packaging 

• Modern devices present problems for 
reliable test board fixture, die access 
(heavy ion tests) requiring expensive 
facility usage or device 
repackaging/thinning 

• Difficulty in high-temp testing (worst-case)

- Hidden registers and modes 

• Functional interrupts driving 
“anomalous data” 

Not just errors to memory cells! 

- Microcontroller 

• Not just a memory 


Commercial memory testing is a lot more complex than in the old days! 
Can we test anything completely?



Sample Single Event Effect Test Matrix (full generic testing)

Amount   Item
3        Number of Samples
68       Modes of Operation
4        Test Patterns
3        Frequencies of Operation
3        Power Supply Voltages
3        Ions
3        Hours per Ion per Test Matrix Point

Total: 66096 Hours = 2754 Days = 7.54 Years

and this didn’t include temperature variations!!! 
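The total above is simply the product of every factor in the matrix. A minimal sketch of that arithmetic follows; the dictionary keys are illustrative names, and the 12-hour figure is the beam-run length quoted elsewhere in this talk.

```python
from math import prod

# Factors copied from the sample test matrix above (key names are illustrative).
matrix = {
    "samples": 3,
    "modes_of_operation": 68,
    "test_patterns": 4,
    "frequencies": 3,
    "supply_voltages": 3,
    "ions": 3,
    "hours_per_ion_per_point": 3,
}

total_hours = prod(matrix.values())   # every combination, tested once
total_days = total_hours / 24
total_years = total_days / 365.25

# Number of 12-hour beam runs needed to cover the full matrix.
beam_runs = total_hours / 12

print(f"{total_hours} hours = {total_days:.0f} days = {total_years:.2f} years")
print(f"{beam_runs:.0f} beam runs of 12 hours each")
```

Dropping or trimming any single factor (fewer modes, one supply voltage) shrinks the total multiplicatively, which is why test plans end up application-specific.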


Test planning requires much more thought in the modern age 
as does understanding of data collected (be wary of databases). 
Only so much can be done in a 12 hour beam run - application-oriented 
The “Perfect” Space Memory



• SWaP rules! 

- No power, Infinite density, Fast (sub 2 ns R/W access) 

• Oh, and Rad Hard (RH) 

• Okay, so this isn’t happening! 

- Speed: 

• Needs to be fast enough for burst data capture and not a 
bottleneck for processor interfaces 

- Power: 

• This is a trade space that includes thermal (stacking, for example) 

• NVM is good since no power consumed when not being accessed 

- Density: 

• Gb regime per die - anything beyond 100 Mb is acceptable! 

• Biggest RH devices currently ~16 Mb regime 

- Note: 1st SSR used 256 Mb commercial SRAMs 20 years ago!!!

• And a personal diatribe: how many operating modes do we 
really need? 

- Byte/nibble and page modes 

• Erase for NVM 
Radiation Requirements (and trends)



• How radiation hard do we really need to be?

- TID 

• >90% of NASA applications are <100 krads-Si in piecepart 
requirements 

- Many commercial devices (NVM and SDRAMs) meet or come close to 
this. 

- Charge pump TID tolerance has improved by roughly an order of magnitude or
more over the last 10 years

• There are always a few programs with higher level needs and, of 
course, defense needs 

- SEL 

• Prefer none or rates that are considered low risk 

- Latent damage (“non-destructive” event) is a bear to deal with 

• As we’re packing cells tighter and even with lower Vdd, we’re 
seeing SEL on commercial devices regularly (<90nm) 

- Often in power conversion, I/O, or control areas 

- SEU 

• It’s not the bit errors, it’s the SEFIs and uncorrectable errors that 
are the biggest issues 

- Scrubbing concerns for risk, power, speed... 
Reliability Considerations

• Besides the usual CMOS concerns, memories have a 
few other considerations 

- Data retention 

• Long-term holding of values and/or requirement to refresh 
values 

- Endurance 

• Ability to read and write values N times (10^5 cycles is a typical
commercial NVM spec, for example)

- Bit disturb (usually with Flash) 

• I.e., read/write/erase of bit A disturbs values on adjacent bit-line

- Note: Many memories have “bad bits” to begin with that are 
mapped 

• Now add in unique space requirements

- >10 year mission life 

- Colder and hotter temperatures (-55 to +125C) 

- Radiation 
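To see how the endurance spec interacts with mission life, here is a rough back-of-the-envelope sketch. It combines the 10^5-cycle figure and the >10 year mission life above with the "write once an orbit" usage noted earlier for SSRs. The orbit rate is an assumption (about 15 orbits per day, typical of low Earth orbit), and the estimate ignores wear leveling, temperature, and radiation effects.

```python
ENDURANCE_CYCLES = 10**5   # typical commercial NVM spec quoted above
MISSION_YEARS = 10         # ">10 year mission life" from the list above
ORBITS_PER_DAY = 15        # ASSUMPTION: roughly a 96-minute low Earth orbit
DAYS_PER_YEAR = 365.25


def years_to_wear_out(writes_per_day, cycles=ENDURANCE_CYCLES):
    """Years until a cell reaches its rated write/erase cycle count."""
    return cycles / writes_per_day / DAYS_PER_YEAR


def writes_per_day_budget(mission_years=MISSION_YEARS, cycles=ENDURANCE_CYCLES):
    """Average writes per day to one cell that just fit within the mission."""
    return cycles / (mission_years * DAYS_PER_YEAR)


once_per_orbit = years_to_wear_out(ORBITS_PER_DAY)
budget = writes_per_day_budget()

print(f"Write once per orbit: rated endurance lasts about {once_per_orbit:.1f} years")
print(f"10-year mission allows about {budget:.1f} writes per cell per day")
```

Under these assumptions a once-per-orbit recorder fits within the rated endurance, while a buffer rewritten many times per orbit would not.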



Memory for Space - Presented by Kenneth A. LaBel, Fault Tolerant Computing, Albuquerque, NM May 25, 2010 




Welcome to the Devil’s Radiation Playground 

SDRAM and SEE 



Errors range from 

- Soft (SEUs)

• Single bit

• Multi-bit (MBU), multi-cell (MCU)

- Destructive (SEL)

- Disruptive (SEFI)

• May require power cycle to restore
functionality (SEFI-PC)

- SEUs are a mere nuisance by comparison

SEE rates depend on pattern, 
operating frequency, 
operating mode... 

- Beam Daddy estimates 7.5+ years for 
exhaustive test! 

• Typical allotted time: 12-24 hours 

Reality: All SDRAM tests are 
application specific 
SDRAM Applications

[Figure: block diagrams of the applications (processor, FPGA, and attached memory devices)]

                Processor     Flight Data        Payload Data
                Scratchpad    Recorder           Recorder
Size            <10 Gbit      >100 Gbit          >500 Gbit
Mitigation      Limited       Good               Good
Reliability     High          High               High
Availability    High          High               High
Integrity       High          Moderate to High   Moderate
Retention       Short         Long               Long


Remember, it’s only uncorrectable errors that are the problem 
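The difference between a correctable and an uncorrectable error can be illustrated with a toy single-error-correcting code. The sketch below uses a textbook Hamming(7,4) code, chosen only for brevity; it is not the code used in any flight system, and real EDAC typically protects much wider words. The function names are illustrative.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits (0..15) as a 7-bit codeword, positions 1..7."""
    d = [(nibble >> i) & 1 for i in range(4)]   # d[0] is the least significant bit
    code = [0] * 8                              # index 0 unused so indices match positions
    code[3], code[5], code[6], code[7] = d      # data bits sit at non-power-of-two positions
    code[1] = code[3] ^ code[5] ^ code[7]       # parity over positions with bit 0 set
    code[2] = code[3] ^ code[6] ^ code[7]       # parity over positions with bit 1 set
    code[4] = code[5] ^ code[6] ^ code[7]       # parity over positions with bit 2 set
    return code[1:]


def hamming74_decode(bits):
    """Return (nibble, syndrome); a nonzero syndrome means one bit was flipped back."""
    code = [0] + list(bits)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                     # XOR of set positions is 0 for a valid word
    if syndrome:
        code[syndrome] ^= 1                     # assume a single error and correct it
    d = [code[3], code[5], code[6], code[7]]
    return sum(bit << i for i, bit in enumerate(d)), syndrome


word = 0b1011
stored = hamming74_encode(word)

single = list(stored)
single[5 - 1] ^= 1                              # one upset at position 5: correctable
single_result, _ = hamming74_decode(single)

double = list(stored)
double[3 - 1] ^= 1                              # two upsets in the same word:
double[6 - 1] ^= 1                              # the decoder "corrects" the wrong bit
double_result, _ = hamming74_decode(double)

print(single_result == word, double_result == word)
```

With this code a second upset in the same word is silently miscorrected, which is why multi-bit upsets and scrubbing intervals matter more than the raw single-bit upset rate.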
Mitigation Can Occur at Many Levels

Examples 



• Single device 

- Interleaved bits 

• Physical separation of bits so that a single energetic
particle doesn’t upset multiple bits in the same word

- Error correcting codes (ECC) 

- Row or column redundancy 

• Multiple device 

- Triple modular redundancy (TMR) - voting 

- External Error Detection And Correction (EDAC) 

• Hardware, FPGA, or Software 

• Block 

- Page mapping (map around “fails”) 

- Ping pong buffers 

- Spare(s) 
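As a sketch of why interleaved bits help, the example below maps a cluster of physically adjacent upset cells to logical words under two layouts. The layouts, function names, and sizes are assumptions chosen for illustration, not a description of any particular device.

```python
def interleaved_word(cell, n_words):
    """Interleaved layout: neighbouring cells belong to different words."""
    return cell % n_words


def contiguous_word(cell, word_bits):
    """Contiguous layout: each word occupies a run of adjacent cells."""
    return cell // word_bits


def errors_per_word(upset_cells, cell_to_word):
    """Count how many upset cells land in each logical word."""
    counts = {}
    for cell in upset_cells:
        word = cell_to_word(cell)
        counts[word] = counts.get(word, 0) + 1
    return counts


# ASSUMPTION: one particle strike upsets three physically adjacent cells.
cluster = [9, 10, 11]

interleaved = errors_per_word(cluster, lambda c: interleaved_word(c, n_words=4))
contiguous = errors_per_word(cluster, lambda c: contiguous_word(c, word_bits=8))

print("interleaved:", interleaved)   # one error in each of three words
print("contiguous: ", contiguous)    # three errors in a single word
```

With interleaving, each affected word sees a single-bit error that a single-error-correcting code can fix, provided the cluster is no wider than the interleave depth.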
Mitigation: What We Can and Can’t Do



Not So Great

Destructive Failure

- Consequences: Permanent loss of one memory die

- Mitigation:

1) Find Immune Part

2) Redundancy

SEFI

- Consequences: Loss of functionality of one memory die

- Mitigation:

1) Find Immune Part

2) Soft Reset

3) Cycle Power

Pretty Darned Good

Data Corruption

- Consequences: Loss of up to all data on a single memory die

- Mitigation:

1) Triplicate voting + Error Scrubbing

• Overhead: 200%

2) EDAC + bit interleaving + Error Scrubbing

• Hamming Code - Ex., <1 bit; 20% overhead

• Reed-Solomon - Ex., <2 nibbles; 50% overhead

- Bonus: Data loss also corrected for SEU, MBU,
and even stuck bits
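Triplicate voting with error scrubbing can be sketched in a few lines. This is a software illustration under simple assumptions (three in-memory copies, a bitwise majority vote, and a scrub pass that writes the voted value back); a flight implementation would normally sit in hardware or an FPGA, and the function names are illustrative.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority: each output bit matches at least two inputs."""
    return (a & b) | (a & c) | (b & c)


def scrub(copies):
    """Vote every address across three copies and repair any copy that disagrees.

    Returns the number of words rewritten.
    """
    repaired = 0
    for addr in range(len(copies[0])):
        voted = majority_vote(copies[0][addr], copies[1][addr], copies[2][addr])
        for copy in copies:
            if copy[addr] != voted:
                copy[addr] = voted
                repaired += 1
    return repaired


data = [0x5A, 0xC3, 0xFF, 0x00]
copies = [list(data) for _ in range(3)]   # three copies: two extra per data word

copies[0][1] ^= 0x10                      # single-bit upset in copy 0
copies[2][3] = 0xFF                       # whole word corrupted in copy 2

repaired = scrub(copies)
overhead_percent = (len(copies) - 1) * 100   # extra storage relative to one copy

print(repaired, overhead_percent)
```

Scrubbing matters because voting alone only masks errors on read; without the write-back, upsets accumulate until two copies of the same bit disagree with the truth.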
Considerations



• Technology changes in memories engender 
challenges 

- Impact of new materials and manufacturing methods on 
radiation response and modeling 

- Increasing difficulty in die accessibility 

- Increasing operating speeds and operating modes 

- More hidden “features” and limited testability 

- Multi-level storage cells (Flash, for example) 

- Unique reliability concerns 

• We need to invest to keep ahead of the curve 

- DDR3 tests now? 

- PCM 

- ST MRAM 

- Reliability on RRAM, etc... 

• It’s the challenges that keep us employed!


We are always open to working with others 